Encoding apparatus and method of same and decoding apparatus and method of same

ABSTRACT

Encoding and decoding systems for MPEG encoding and decoding at a high speed using a parallel processing system, wherein macroblocks to be processed are designated for first to third processors which are made to carry out all processings of encoding, variable length coding, and local decoding of those macroblocks; the variable length coding is carried out after confirming that the variable length coding with respect to the previous macroblock is ended; the variable length coding which was normally sequentially carried out at a specific processor is carried out at all of the processors; and the encoding and local decoding are carried out at all of the processors; whereby the loads are dispersed, the efficiency is improved as a whole, and the processing speed becomes fast.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an encoding apparatus for transforming data such as video data and audio data, by for example the MPEG method (a high quality moving picture encoding system by the Moving Picture Experts Group), to a bit stream composed of variable length data, and to a decoding apparatus of the same, and more particularly relates to an encoding apparatus and a decoding apparatus for carrying out encoding and decoding at a high speed by parallel processing and methods of the same.

2. Description of the Related Art

First, an explanation will be made of the MPEG method (MPEG1 and MPEG2), the standard encoding and decoding system for images currently in general use.

FIG. 1 is a view of the structure of image data in the MPEG method.

As shown in FIG. 1, the image data of the MPEG method is organized in a hierarchical structure.

The hierarchy is, in order from the top, a video sequence (hereinafter simply referred to as a “sequence”), groups of pictures (GOP), pictures, slices, macroblocks, and blocks.

In MPEG encoding, the image data is sequentially encoded based on this hierarchical structure so as to be transformed to a bit stream.

The structure of a bit stream of MPEG encoded data is shown in FIG. 2.

In the bit stream of FIG. 2, each picture has j number of slices, and each slice has i number of macroblocks.

Further, each level of data other than the blocks in the hierarchy shown in FIG. 1 has a header in which an encoding mode etc. are stored. Accordingly, when the structure of the bit stream is described starting from the header of the video sequence, it consists of a sequence header (SEQH) 151, a GOP header (GOPH) 152, a picture header (PH) 153, a slice header (SH) 154, a macroblock header (MH) 155, compressed data (MB0) 156 of a macroblock 0, a macroblock header (MH) 157, compressed data (MB1) 158 of a macroblock 1, and so on.

Note that the size of the compressed data of a macroblock contained in a bit stream is of a variable length and differs depending on the nature of the image etc.
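The header ordering just described can be illustrated with a short sketch. The function name `stream_layout` and the string tags are assumptions introduced for illustration; only the ordering (SEQH, GOPH, PH, SH, then pairs of MH and macroblock data) is taken from the description of FIG. 2.

```python
def stream_layout(gops, pictures, slices, macroblocks):
    """Return the order of headers and macroblock payloads in one sequence.

    The counts are per parent level (pictures per GOP, slices per picture,
    macroblocks per slice). Payload sizes are not modelled here.
    """
    layout = ["SEQH"]                      # sequence header comes first
    for _ in range(gops):
        layout.append("GOPH")              # one header per group of pictures
        for _ in range(pictures):
            layout.append("PH")            # one header per picture
            for _ in range(slices):
                layout.append("SH")        # one header per slice
                for mb in range(macroblocks):
                    layout.append("MH")    # macroblock header
                    layout.append(f"MB{mb}")  # variable length compressed data
    return layout


print(stream_layout(1, 1, 1, 2))
# ['SEQH', 'GOPH', 'PH', 'SH', 'MH', 'MB0', 'MH', 'MB1']
```

With one GOP, one picture, one slice, and two macroblocks, the sketch reproduces the element order 151 to 158 listed above.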

In MPEG decoding, this bit stream is sequentially decoded and the image is reconstructed based on the hierarchical structure of FIG. 1.

Next, the structure of a processing unit for carrying out the encoding and the decoding by the MPEG method, the processing algorithms, and the flow of the processing will be concretely explained.

First, an explanation will be made of the encoding.

FIG. 3 is a block diagram of the configuration of a general processing unit for carrying out MPEG encoding.

An encoding apparatus 160 shown in FIG. 3 has a motion vector detection unit (ME) 161, a subtractor 162, a forward discrete cosine transform (FDCT) unit 163, a quantization unit 164, a variable length coding unit (VLC) 165, an inverse quantization unit (IQ) 166, an inverse discrete cosine transform (IDCT) unit 167, an adder 168, a motion compensation unit (MC) 169, and an encode control unit 170.

In an encoding apparatus 160 having such a configuration, when the encoding mode of the input image data is a P (predictive coded) picture or B (bidirectionally predictive coded) picture, the motion compensation prediction is carried out in units of macroblocks at the motion vector detection unit 161, a predicted error is detected at the subtractor 162, DCT is carried out with respect to the predicted error at the discrete cosine transform unit 163, and thereby a DCT coefficient is found. Further, when the encoded picture is an I (intra-coded) picture, the pixel value is input to the discrete cosine transform unit 163 as it is, DCT is carried out, and thereby the DCT coefficient is found.

The found DCT coefficient is quantized at the quantization unit 164 and subjected to variable length coding together with the motion vector or encoding mode information at the variable length coding unit 165, whereby an encoded bit stream is generated. Further, the quantized data generated at the quantization unit 164 is inversely quantized at the inverse quantization unit 166, subjected to IDCT at the inverse discrete cosine transform unit 167 to be restored to the original predicted error, and added to a reference image at the adder 168, whereby a reference image is generated at the motion compensation unit 169.

Note that the encode control unit 170 controls the operation of these parts of the encoding apparatus 160.

Such encoding is generally roughly classified into three kinds of processing, that is, the encoding from the motion vector detection at the motion vector detection unit 161 to the quantization at the quantization unit 164, the variable length coding in the variable length coding unit 165 for generating the bit stream, and the local decoding from the inverse quantization in the inverse quantization unit 166 to the motion compensation in the motion compensation unit 169.
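The three-part classification above can be sketched as three small functions. This is a simplified illustration only: the DCT and motion estimation are omitted, the quantizer step `QSTEP` is a hypothetical value, and the unary-style code in `variable_length_code` is a stand-in for the actual MPEG code tables, none of which are taken from this description.

```python
QSTEP = 8  # hypothetical quantizer step size (assumption)


def encode(pixels, prediction):
    """Encoding part: prediction error followed by quantization (DCT omitted)."""
    return [round((p - q) / QSTEP) for p, q in zip(pixels, prediction)]


def variable_length_code(levels):
    """Variable length coding part: toy code whose length depends on the value.

    Each level becomes |v| ones, a terminating zero, and a sign bit.
    """
    bits = ""
    for v in levels:
        sign = "1" if v < 0 else "0"
        bits += "1" * abs(v) + "0" + sign
    return bits


def local_decode(levels, prediction):
    """Local decoding part: inverse quantization plus the prediction."""
    return [q + v * QSTEP for v, q in zip(levels, prediction)]


pixels = [100, 120, 90]
prediction = [97, 104, 98]
levels = encode(pixels, prediction)          # [0, 2, -1]
bits = variable_length_code(levels)          # "001100101"
reference = local_decode(levels, prediction) # [97, 120, 90]
```

Note that only the variable length coding part produces output whose length depends on the data; the other two parts work on fixed-size inputs.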

Next, an explanation will be made of the flow of the processing for carrying out such encoding and generating an encoded bit stream having the structure shown in FIG. 2 by referring to FIG. 4.

FIG. 4 is a flow chart of the flow of the processing for generating a bit stream by carrying out MPEG encoding.

When the encoding is started (step S180), a sequence header is generated (step S181), a GOP header is generated (step S182), a picture header is generated (step S183), and a slice header is generated (step S184).

When the generation of headers of the different levels is ended, macroblock encoding is carried out (step S185), macroblock variable length coding is carried out (step S186), and macroblock local decoding is carried out (step S187).

When the encoding is ended for all macroblocks inside a slice, the processing routine shifts to the processing of the next slice (step S188). Below, similarly, when all processing of a picture is ended, the processing routine shifts to the processing of the next picture (step S189). When all processing of one GOP is ended, the processing routine shifts to the processing of the next GOP (step S190). This series of processing is repeated until the sequence is ended (step S191), whereupon the processing is ended (step S192).

A timing chart showing the sequential execution of such encoding by a processor, for example, a digital signal processor (DSP), is shown in FIG. 5.

As shown in FIG. 5, in the processor, the processing of the flow chart shown in FIG. 4 is sequentially carried out for every macroblock.

Note that, in FIG. 5, the processing “MBx-ENC” indicates the encoding with respect to the data of an (x+1)th macroblock x, the processing “MBx-VLC” indicates variable length coding with respect to the data of the (x+1)th macroblock x, and the processing “MBx-DEC” indicates the local decoding with respect to the data of the (x+1)th macroblock x.
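As a small illustration of the sequential order on a single processor, the following sketch lists the processing labels in the order in which they would be executed. The function name `sequential_schedule` is an assumption made for this example; the labels follow the naming of FIG. 5.

```python
def sequential_schedule(num_macroblocks):
    """Order of processing on one processor: ENC, VLC, DEC per macroblock."""
    return [
        f"MB{x}-{step}"
        for x in range(num_macroblocks)
        for step in ("ENC", "VLC", "DEC")
    ]


print(sequential_schedule(2))
# ['MB0-ENC', 'MB0-VLC', 'MB0-DEC', 'MB1-ENC', 'MB1-VLC', 'MB1-DEC']
```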

Next, an explanation will be made of the decoding.

FIG. 6 is a block diagram of the configuration of a general processing unit for carrying out the MPEG decoding.

A decoding apparatus 200 shown in FIG. 6 has a variable length decoding unit (VLD) 201, an inverse quantization unit (IQ) 202, an inverse discrete cosine transform unit (IDCT) 203, an adder 204, a motion compensation unit (MC) 205, and a decode control unit 206.

In a decoding apparatus 200 having such a configuration, a bit stream of the input encoded data is decoded at the variable length decoding unit 201 to separate the encoding mode, motion vector, quantization information, and quantized DCT coefficient for every macroblock. The decoded quantized DCT coefficient is subjected to inverse quantization at the inverse quantization unit 202, restored to the DCT coefficient, subjected to IDCT by the inverse discrete cosine transform unit 203, and transformed to pixel space data.

When the block is in the motion compensation prediction mode, the motion compensation predicted block data is added at the adder 204 to restore and output the original data. Further, the motion compensation unit 205 carries out motion compensation prediction based on the decoded image to generate the data to be added at the adder 204.

Note that the decode control unit 206 controls the operations of these units of the decoding apparatus 200.

Note that such decoding may be generally roughly classified into two kinds of processing, that is, the variable length decoding at the variable length decoding unit 201 for decoding the bit stream and the decoding from the inverse quantization in the inverse quantization unit 202 to the motion compensation in the motion compensation unit 205.

Next, an explanation will be made of the flow of the processing for carrying out such decoding to decode an encoded bit stream having the structure shown in FIG. 2 by referring to FIG. 7.

FIG. 7 is a flow chart showing the flow of the processing for generating the original image data by carrying out MPEG decoding.

When the decoding is started (step S210), the sequence header is decoded (step S211), the GOP header is decoded (step S212), the picture header is decoded (step S213), and the slice header is decoded (step S214).

When the decoding of the headers of the different levels is ended, macroblock variable length decoding is carried out (step S215), and decoding of the macroblock is carried out (step S216).

When the decoding is ended for all macroblocks inside the slice, the processing routine shifts to the processing of the next slice (step S217). Below, similarly, when all processing of one picture is ended, the processing routine shifts to the processing of the next picture (step S218), and when all processing of one GOP is ended, the processing routine shifts to the processing of the next GOP (step S219). This series of processing is repeated until the sequence is ended (step S220), whereupon the processing is ended (step S221).

A timing chart of the sequential execution of such decoding by a processor, for example, a DSP, is shown in FIG. 8.

As shown in FIG. 8, in the processor, processing of the flow chart shown in FIG. 7 is sequentially carried out for every slice and for every macroblock inside each slice.

Note that, in FIG. 8, the processing “SH-VLD” indicates the slice header decoding, the processing “MBx-VLD” indicates the variable length decoding with respect to the encoded data of the (x+1)th macroblock x, and the processing “MBx-DEC” indicates the decoding with respect to the encoded data of the (x+1)th macroblock x.

Summarizing the disadvantages to be solved by the invention, there is a demand that such encoding and decoding of image and other data be efficiently carried out at a high speed by a parallel processor having a plurality of processors. However, the parallel processors and parallel processing methods used heretofore have suffered from various disadvantages, so have not been able to carry out high speed processing with a sufficiently high efficiency.

Specifically, first, when it is desired to carry out the encoding and decoding efficiently by parallel processing, there is a disadvantage that it is difficult to determine how to allocate which steps to the plurality of processors.

Further, in such encoding and decoding, since variable length data is to be processed, the variable length coding and variable length decoding must be carried out sequentially in the order of the data. For this reason, there is the disadvantage that the parallel processing is interrupted at the time of execution of the sequential processing parts or that the processing speed is limited since the sequential processing parts become an obstacle.

Further, if the times for execution of the processing in the processors are equal, the loads become uniform and efficient processing can be carried out, but since the processing times of the different steps are different, there is a disadvantage that the loads of the processors become nonuniform and therefore high efficiency processing cannot be carried out.

Further, in such a parallel processing method, since in the case of for example the above image data, the processing with respect to one set of data like one video segment is carried out divided among a plurality of processors, it is necessary to carry out synchronization along with the transfer of the data or control the communication, so there is the disadvantage that the configuration of the hardware, the control method, etc. become complex.

Further, since the processing to be carried out at the different processors differs, processing programs must be prepared for the individual processors and the processing must be separately controlled for the individual processors, so there is the disadvantage that the configuration of the hardware, control method, etc. become even more complex.

SUMMARY OF THE INVENTION

An object of the present invention is to provide an encoding apparatus and a decoding apparatus having a plurality of processors capable of carrying out the encoding and decoding of for example image data at a high speed and having simple configurations.

Further, another object of the present invention is to provide an encoding method and a decoding method which can be applied to parallel processors having any configuration and which are capable of carrying out the encoding and decoding of for example image data at a high speed.

According to a first aspect of the present invention, there is provided an encoding apparatus for encoding data which comprises a plurality of block data including a plurality of element data which are sequentially transferred in a form of a data stream, the encoding apparatus comprising a plurality of signal processing devices connected by a signal transfer means on which the data is transferred, each signal processing device comprising: an encoding means for encoding a block data including a plurality of element data on the signal transfer means, and a variable length coding means for carrying out a variable length coding of the encoded block data and outputting the variable length coded data via the signal transfer means in accordance with the data stream.

According to a second aspect of the present invention, there is provided an encoding method for encoding a data stream having a plurality of element data, comprising the steps of: dividing the data stream into a predetermined plurality of block data, successively allotting the divided plurality of block data to a plurality of signal processing devices, encoding the allotted block data based on a predetermined method in each of the plurality of signal processing devices, successively carrying out variable length coding on the encoded data in the same signal processing devices as those for the encoding so that the encoded data for each block data encoded in the plurality of signal processing devices are successively subjected to the variable length coding according to the order in the data stream, and successively allotting new block data to the signal processing devices for which the variable length coding is ended.

According to a third aspect of the present invention, there is provided a decoding apparatus for decoding encoded and variable length coded data which comprises a plurality of block data including a plurality of element data in a form of a data stream, the decoding apparatus comprising a plurality of signal processing devices, each of the signal processing devices comprising: a variable length decoding means for successively carrying out variable length decoding on variable length coded block data in accordance with the data stream, and a decoding means for decoding the variable length decoded block data.

According to a fourth aspect of the present invention, there is provided a decoding method for decoding a variable length coded data stream obtained by encoding a data stream having a plurality of element data for every predetermined block data and further carrying out variable length coding, comprising the steps of: successively allotting the variable length coded data for each block data successively arranged in the variable length coded data stream to a plurality of signal processing devices, successively carrying out variable length decoding on the variable length coded data for every allotted block data so that the variable length decoding carried out in the plurality of signal processing devices is successively carried out according to the order of the block data in the data stream in each of the plurality of signal processing devices, decoding the encoded data for each block data subjected to the variable length decoding in the same signal processing device in each of the plurality of signal processing devices, and allotting variable length coded data of new block data to be decoded next to the signal processing devices for which the decoding is ended.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other objects and features of the present invention will become clearer from the following description of a preferred embodiment given with reference to the accompanying drawings, in which:

FIG. 1 is a view of the structure of image data in MPEG encoding;

FIG. 2 is a view of the structure of an MPEG encoded image data bit stream;

FIG. 3 is a block diagram of the configuration of a processing unit for carrying out the MPEG encoding;

FIG. 4 is a flow chart of the flow of processing for generating the bit stream shown in FIG. 2 by carrying out MPEG encoding;

FIG. 5 is a timing chart of the operation of the processing unit when MPEG encoding is carried out by sequential processing;

FIG. 6 is a block diagram of the configuration of a processing unit for carrying out MPEG decoding;

FIG. 7 is a flow chart of the flow of processing for decoding the bit stream shown in FIG. 2 by carrying out MPEG decoding;

FIG. 8 is a timing chart of the operation of a processing unit when MPEG decoding is carried out by sequential processing;

FIG. 9 is a schematic block diagram of the configuration of a parallel processing unit of an image encoding/decoding apparatus according to the present invention;

FIG. 10 is a flow chart of the processing in the case where an image is encoded by the conventional parallel processing method in the master processor (first processor) of the parallel processing unit shown in FIG. 9;

FIG. 11 is a flow chart of the processing in the case where an image is encoded by the conventional parallel processing method in slave processors (second to n-th processors) of the parallel processing unit shown in FIG. 9;

FIG. 12 is a timing chart of the state of processing in processors in a case where an image is encoded by the conventional parallel processing method in the parallel processing unit shown in FIG. 9;

FIG. 13 is a flow chart of the processing in the case where an image is decoded by the conventional parallel processing method in the master processor (first processor) of the parallel processing unit shown in FIG. 9;

FIG. 14 is a flow chart of the processing in the case where an image is decoded by the conventional parallel processing method in slave processors (second to n-th processors) of the parallel processing unit shown in FIG. 9;

FIG. 15 is a timing chart of the state of processing in processors in a case where an image is decoded by the conventional parallel processing method in the parallel processing unit shown in FIG. 9;

FIG. 16 is a flow chart of the processing in the case where an image is encoded by the parallel processing method according to the present invention in the master processor (first processor) of the parallel processing unit shown in FIG. 9;

FIG. 17 is a flow chart of the processing in the case where an image is encoded by the parallel processing method according to the present invention in slave processors (second to n-th processors) of the parallel processing unit shown in FIG. 9;

FIG. 18 is a timing chart of the state of processing in processors in a case where an image is encoded by the parallel processing method according to the present invention in the parallel processing unit shown in FIG. 9;

FIG. 19 is a flow chart of the processing in a case where an image is decoded by the parallel processing method according to the present invention in the master processor (first processor) of the parallel processing unit shown in FIG. 9;

FIG. 20 is a flow chart of the processing in a case where an image is decoded by the parallel processing method according to the present invention in slave processors (second to n-th processors) of the parallel processing unit shown in FIG. 9; and

FIG. 21 is a timing chart of the state of processing in processors in a case where an image is decoded by the parallel processing method according to the present invention in the parallel processing unit shown in FIG. 9.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

An explanation will be made next of a preferred embodiment of the present invention by referring to FIG. 9 to FIG. 21.

In the following embodiment, the present invention will be explained by taking as an example an image encoding/decoding apparatus carrying out parallel processing by a plurality of processors to encode and decode a moving picture by MPEG2.

Note that, as the units of processing when carrying out the parallel processing of the MPEG encoding and decoding, any of the levels shown in FIG. 1 or a pixel can be considered, but in the following embodiment, the explanation will be made of a case where a macroblock is selected as the unit of parallel processing.

When using a macroblock as the unit of parallel processing, the encoding, local decoding, and decoding can be executed in parallel inside one slice, but it is necessary to sequentially execute the variable length coding and variable length decoding. This is because, in variable length coding and variable length decoding, the compressed data of the macroblock has a variable length and the header position of the compressed data of a macroblock on the bit stream is not determined until the variable length coding or the variable length decoding of the macroblock immediately before this is completed.
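The dependency described above can be made concrete with a short sketch: the bit position at which a macroblock begins is the running sum of the lengths of all macroblocks before it, so it is unknown until those have been processed. The function name `start_offsets` and the example lengths are assumptions chosen for illustration.

```python
def start_offsets(lengths_in_bits):
    """Bit offset at which each macroblock's compressed data begins.

    Each offset depends on every preceding length, which is why the
    variable length stage cannot be started independently per macroblock.
    """
    offsets, position = [], 0
    for length in lengths_in_bits:
        offsets.append(position)   # known only after all earlier lengths
        position += length
    return offsets


print(start_offsets([37, 12, 58, 20]))
# [0, 37, 49, 107]
```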

Note that the same limitation applies in the case where the slice is used as the unit of parallel processing.

First Image Encoding/Decoding Apparatus

First, an explanation will be made of an image encoding/decoding apparatus of the related art for carrying out the encoding and decoding of an image as mentioned above by parallel processing.

FIG. 9 is a schematic block diagram of the configuration of a parallel processing unit of an image encoding/decoding apparatus.

As shown in FIG. 9, the parallel processing unit 9 of the image encoding/decoding apparatus has n number of processors 2-1 to 2-n, a memory 3, and a connection network 4.

First, an explanation will be made of the configuration of this parallel processing unit 9.

The n number of processors 2-1 to 2-n are processors for independently carrying out predetermined processing. Each processor 2-i (i=1 to n) has a program read only memory (ROM) or program random access memory (RAM) storing a processing program to be executed and a RAM for storing data etc. regarding the processing. The processor 2-i carries out the predetermined processing according to the program stored in the program ROM or program RAM in advance.

Note that, in the present embodiment, it is assumed that n=3, that is, the parallel processing unit 9 has three processors 2-1 to 2-3.

Further, in the following explanation, the description will be made of only the processing concerning the encoding and decoding of the image data by the processors 2-1 to 2-n, but the processing for controlling the operation of the entire parallel processing unit 9 is carried out in one of the processors 2-i (i=1 to n) or in each of the n number of processors 2-1 to 2-n in parallel. By this control operation, the processors 2-1 to 2-n carry out the processing as will be explained below in association or in synchronization.

The memory 3 is a common memory of the n number of processors 2-1 to 2-n. The image data to be processed and the data of the processing result are stored in the memory 3. Data is appropriately read and written by the n number of processors 2-1 to 2-n.

The connection network 4 is a connection portion for connecting the n number of processors 2-1 to 2-n and the memory 3 to each other so that the n number of processors 2-1 to 2-n operate in association or the n number of processors 2-1 to 2-n appropriately refer to the memory 3.

Next, an explanation will be made of the processing in each processor 2-i (i=1 to 3) and the processing of the parallel processing unit 9 where the parallel processing unit 9 having such a configuration encodes a moving picture as mentioned above.

First, an explanation will be made of the processing in each processor 2-i.

In the parallel processing unit 9, the variable length coding of the macroblocks is allotted to one processor (hereinafter, this processor will be referred to as the “master processor”) in a fixed manner and that processor is made to sequentially execute the processing, and the encoding and the local decoding are allotted to the other processors (hereinafter, these processors will be referred to as “slave processors”) and those processors are made to execute the parallel processing. In the parallel processing unit 9 shown in FIG. 9, the first processor 2-1 is made the master processor, and the second and the third processors 2-2 and 2-3 are made the slave processors.

First, the first processor 2-1 serving as the master processor carries out the processing as shown in the flow chart of FIG. 10.

Namely, when the encoding is started (step S10), the sequence header is generated (step S11), the GOP header is generated (step S12), the picture header is generated (step S13), and the slice header is generated (step S14).

When the generation of the slice header is ended, the master processor activates the slave processors (step S15) and enters into a state waiting for the end of the encoding in the slave processors (step S16).

When the encoding of the macroblocks in the slave processors is ended (step S16), the variable length coding of those macroblocks is started (step S17). Note that this variable length coding must be sequentially executed due to the limitation as mentioned above. Accordingly, even if the encoding of the macroblock 1 is ended before the encoding of the macroblock 0, the master processor first carries out the variable length coding of the macroblock 0 without fail.

The master processor repeats this procedure until all processing inside a slice is ended (step S18). When all processing inside the slice is ended, it waits for the end of all processing in the slave processors (step S19).

Below, similarly, when all processing of one picture is ended, the processing routine shifts to the processing of the next picture (step S20), and when the processing of all pictures of one GOP is ended, the processing routine shifts to the processing of the next GOP (step S21). This series of processing is repeated until the sequence is ended (step S22), whereupon the processing is ended (step S23).

Next, the second and third processors 2-2 and 2-3 serving as the slave processors carry out the processing as shown in the flow chart of FIG. 11.

Namely, when activated by the processing of step S15 in the master processor and starting the encoding (step S30), each of the slave processors first acquires the number of the macroblock to process (step S31) and encodes that macroblock (step S32).

When the encoding is ended, the slave processors wait for the end of the variable length coding in the master processor (step S33). When the variable length coding is ended, they carry out the local decoding (step S34).

This procedure is repeated until all processing inside a slice is ended (step S35). When all processing inside the slice is ended (step S35), the processing of the slave processors is ended (step S36).

Note that the programs by which the master processor and slave processors carry out the processing are stored in advance in the program ROMs or the program RAMs provided with respect to the processors 2-i. The processors 2-i operate in accordance with these programs so as to carry out this processing.
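The hand-off between the master flow of FIG. 10 and the slave flow of FIG. 11 can be sketched with threads standing in for processors. This is only an illustrative model under stated assumptions: the function name `run_conventional`, the use of `threading.Event` for the two wait states, and the placeholder strings standing in for encoded data are all introduced here and are not part of the described apparatus.

```python
import threading


def run_conventional(num_mb, num_slaves=2):
    """Toy run of the master/slave split for one slice.

    Returns the stream (in macroblock order) and a log of VLC/DEC events.
    """
    enc_done = [threading.Event() for _ in range(num_mb)]
    vlc_done = [threading.Event() for _ in range(num_mb)]
    encoded = [None] * num_mb
    numbers = iter(range(num_mb))
    lock = threading.Lock()
    stream, log = [], []

    def slave():
        while True:
            with lock:
                mb = next(numbers, None)     # step S31: acquire macroblock number
            if mb is None:
                return                       # step S36: slice finished
            encoded[mb] = f"enc{mb}"         # step S32: encoding (placeholder)
            enc_done[mb].set()
            vlc_done[mb].wait()              # step S33: wait for the master's VLC
            with lock:
                log.append(("DEC", mb))      # step S34: local decoding (placeholder)

    slaves = [threading.Thread(target=slave) for _ in range(num_slaves)]
    for t in slaves:                         # step S15: activate the slaves
        t.start()
    for mb in range(num_mb):                 # master: strictly in macroblock order
        enc_done[mb].wait()                  # step S16: wait for encoding to end
        stream.append(encoded[mb])           # step S17: variable length coding
        with lock:
            log.append(("VLC", mb))
        vlc_done[mb].set()
    for t in slaves:                         # step S19: wait for the slaves
        t.join()
    return stream, log


stream, log = run_conventional(4)
print(stream)  # ['enc0', 'enc1', 'enc2', 'enc3']
```

In this model the stream order is fixed regardless of which slave finishes first, and each slave remains blocked between its encoding and its local decoding until the master has handled its macroblock.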

Next, an explanation will be made of the operation of the parallel processing unit 9 when encoding a moving picture by referring to FIG. 12.

FIG. 12 is a timing chart of the state of the encoding in the three processors 2-1 to 2-3.

Note that, in FIG. 12, the processing “MBx-ENC” indicates the encoding with respect to the (x+1)th macroblock x (step S32 in FIG. 11), the processing “MBx-DEC” indicates the local decoding with respect to the (x+1)th macroblock x (step S34 in FIG. 11), and the processing “MBx-VLC” indicates the variable length coding with respect to the (x+1)th macroblock x (step S17 in FIG. 10).

As shown in FIG. 12, when the encoding is started, first the second processor 2-2 and the third processor 2-3 carry out the encoding MB0-ENC and MB1-ENC of the macroblock 0 and the macroblock 1.

When the encoding MB0-ENC of the macroblock 0 in the second processor 2-2 is ended, the first processor 2-1 carries out the variable length coding MB0-VLC with respect to the encoded data.

The encoding MB1-ENC of the macroblock 1 in the third processor 2-3 is ended while the variable length coding MB0-VLC of the macroblock 0 is being carried out in the first processor 2-1, therefore, the first processor 2-1 subsequently carries out the variable length coding MB1-VLC with respect to the encoded data of the macroblock 1.

On the other hand, in the second processor 2-2, when the variable length coding MB0-VLC with respect to the macroblock 0 is ended in the first processor 2-1, the local decoding MB0-DEC with respect to that data is carried out. Then, when this local decoding MB0-DEC is ended, the encoding MB2-ENC with respect to the next macroblock 2 is carried out.

Also in the third processor 2-3, similarly, when the variable length coding MB1-VLC with respect to the macroblock 1 is ended in the first processor 2-1, the local decoding MB1-DEC with respect to that data is carried out. Then, when this local decoding MB1-DEC is ended, the encoding MB3-ENC with respect to the next macroblock 3 is carried out.

Below, similarly, in the first processor 2-1, when the encoding MBx-ENC of the macroblock to be processed next is ended in the second processor 2-2 or the third processor 2-3, the variable length coding MBx-VLC of the encoded data is sequentially carried out.

Further, in the second processor 2-2 and the third processor 2-3, when the variable length coding MBx-VLC is ended in the first processor 2-1, the local decoding MBx-DEC with respect to the macroblock thereof is carried out, and after the end of the processing, the encoding MBx-ENC with respect to the next macroblock is subsequently carried out.
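The timing behavior described for FIG. 12 can be approximated by a small deterministic simulation. The sketch below is an assumption-laden model, not a reproduction of the figure: the function name `simulate_conventional` is hypothetical, every macroblock is given the same fixed durations `t_enc`, `t_vlc`, and `t_dec`, and the time units are arbitrary.

```python
import heapq


def simulate_conventional(num_mb, num_slaves, t_enc, t_vlc, t_dec):
    """Return the master's (start, end) VLC span per macroblock and the finish time."""
    free_at = [(0, s) for s in range(num_slaves)]  # (time slave is free, slave id)
    heapq.heapify(free_at)
    vlc_end_prev = 0
    vlc_spans = []
    finish = 0
    for _ in range(num_mb):
        t_free, slave = heapq.heappop(free_at)     # earliest free slave takes the next macroblock
        enc_end = t_free + t_enc
        vlc_start = max(enc_end, vlc_end_prev)     # master must also finish the previous VLC
        vlc_end = vlc_start + t_vlc
        dec_end = vlc_end + t_dec                  # slave waits for VLC, then decodes locally
        heapq.heappush(free_at, (dec_end, slave))
        vlc_spans.append((vlc_start, vlc_end))
        vlc_end_prev = vlc_end
        finish = max(finish, dec_end)
    return vlc_spans, finish


spans, finish = simulate_conventional(4, 2, t_enc=4, t_vlc=2, t_dec=3)
print(spans)   # [(4, 6), (6, 8), (13, 15), (15, 17)]
print(finish)  # 20
```

With these assumed durations the master is busy for only 8 of the 20 time units, and each slave idles while waiting for its macroblock's variable length coding, which illustrates the uneven loads discussed earlier. A single processor doing everything sequentially would need 4 × (4 + 2 + 3) = 36 units under the same assumptions.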

Note that the variable length coding can be divided into the phase for generating the variable length data from the fixed length data by table conversion and the phase for combining the variable length data to generate the bit stream. These two phases may both be sequentially executed, or only the latter phase may be sequentially executed and the former phase be executed in parallel. Note that a buffer memory becomes necessary between the former phase and the latter phase in the latter method.
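The two-phase split can be sketched as follows, with the table conversion run in parallel and only the concatenation kept sequential. The code table `TABLE` is a small hypothetical prefix-free code chosen for this example and is not an MPEG table; the names `table_convert` and `build_bit_stream` are likewise assumptions. The list holding the converted codes plays the role of the buffer memory mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical prefix-free code table (assumption); real tables are far larger.
TABLE = {0: "1", 1: "011", -1: "010", 2: "0011", -2: "0010"}


def table_convert(levels):
    """Former phase: map fixed length symbols to variable length codes."""
    return "".join(TABLE[v] for v in levels)


def build_bit_stream(macroblocks, workers=2):
    """Run the former phase in parallel, then combine sequentially."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order; this list is the buffer.
        buffered = list(pool.map(table_convert, macroblocks))
    # Latter phase: concatenation in macroblock order, which fixes bit positions.
    return "".join(buffered)


print(build_bit_stream([[0, 1], [-1, 2], [0]]))
# 101101000111
```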

Next, an explanation will be made of the processing in each processor 2-i (i=1 to 3) when decoding the moving picture as mentioned above in the parallel processing unit 9 and of the operation of the parallel processing unit 9.

First, an explanation will be made of the processing in each processor 2-i.

In the parallel processing unit 9, the variable length decoding of macroblocks is allotted to one processor (hereinafter this processor will be referred to as the “master processor”) in a fixed manner and that processor is made to sequentially execute the processing. The decoding is allotted to the other processors (hereinafter, these processors will be referred to as the “slave processors”) and the slave processors are made to carry out the parallel processing. In the parallel processing unit 9 shown in FIG. 9, the first processor 2-1 is made the master processor, and the second and the third processors 2-2 and 2-3 are made the slave processors.

First, the first processor 2-1 serving as the master processor carries out the processing as shown in the flow chart of FIG. 13.

Namely, when the decoding is started (step S40), the sequence header is decoded (step S41), the GOP header is decoded (step S42), the picture header is decoded (step S43), and the slice header is decoded (step S44).

When the decoding of the slice header is ended, the master processor activates the slave processors (step S45) and carries out the variable length decoding with respect to a macroblock (step S46). The master processor repeatedly carries out this variable length decoding (step S46) until this processing is ended for all macroblocks inside the slice.

When the variable length decoding with respect to all macroblocks inside a slice is ended, the master processor waits for the end of all processings in the slave processors (step S48). When the processings in the slave processors are ended (step S48), the processing routine shifts to the processing with respect to the next picture (step S49).

When the processing of all pictures of one GOP is ended (step S49), the processing routine shifts to the processing of the next GOP (step S50). When the processing of all GOPs is ended (step S50), the processing routine shifts to the processing of the next sequence (step S51). This series of processing is repeated until all sequences are ended (step S51), whereby the processing is ended (step S52).

Next, the second and third processors 2-2 and 2-3 serving as the slave processors carry out the processing as shown in the flow chart of FIG. 14.

Namely, when started by the processing of step S45 in the master processor and starting the decoding (step S60), first each slave processor obtains the number of the macroblock to be processed (step S61) and waits for the end of the variable length decoding of the related macroblock at step S46 at the master processor (step S62).

Next, when the variable length decoding is ended, the slave processor decodes the macroblock using that data (step S63).

This procedure is repeated until the processing of all macroblocks inside the slice is ended (step S64). When all processing inside the slice is ended (step S64), the processing of the slave processors is ended (step S65).
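
The procedures of FIG. 13 and FIG. 14 for a single slice may be sketched as follows in Python, with threads standing in for the processors 2-i. This is only an illustration under assumptions: the function name decode_slice_master_slave is hypothetical, vld and dec are placeholder callables supplied by the caller, and handing out macroblock numbers from a shared counter is just one possible way of obtaining the number at step S61. Because Python threads do not necessarily run in parallel, the sketch shows the synchronization structure rather than the speed-up.

```python
import threading

def decode_slice_master_slave(coded_mbs, vld, dec, num_slaves=2):
    """One master runs every VLD in order; the slaves decode in parallel."""
    n = len(coded_mbs)
    vld_out = [None] * n                          # data passed from master to slaves
    vld_done = [threading.Event() for _ in range(n)]
    result = [None] * n
    numbers = iter(range(n))
    number_lock = threading.Lock()

    def slave():
        while True:
            with number_lock:                     # step S61: obtain a macroblock number
                x = next(numbers, None)
            if x is None:                         # step S64: slice finished
                return
            vld_done[x].wait()                    # step S62: wait for the master's VLD
            result[x] = dec(vld_out[x])           # step S63: decode the macroblock

    slaves = [threading.Thread(target=slave) for _ in range(num_slaves)]
    for t in slaves:                              # step S45: activate the slaves
        t.start()
    for x in range(n):                            # step S46: sequential VLD
        vld_out[x] = vld(coded_mbs[x])
        vld_done[x].set()
    for t in slaves:                              # step S48: wait for the slaves
        t.join()
    return result
```

Note that the master thread in this sketch has nothing left to do between the end of its loop and the end of the slaves, which corresponds to the waiting at step S48.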

Note that, the programs by which the master processor and slave processors carry out the processing are stored in advance in the program ROMs or the program RAMs provided with respect to the processors 2-i. The processors 2-i operate in accordance with these programs so as to carry out these processings.

Further, when a slice is used as the unit of parallel processing in the variable length decoding, the header of the next slice on the bit stream can be found without carrying out the variable length decoding. This becomes possible by finding the slice start code placed at the header of the slice by scanning. Accordingly, a processing method of carrying out only this scanning sequentially and carrying out the other processing containing the variable length decoding in parallel is possible.
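
The scanning mentioned above may be sketched as follows in Python. In MPEG-1 and MPEG-2 video, a start code consists of the byte-aligned prefix 0x000001 followed by one code byte, and code bytes 0x01 to 0xAF indicate slices. The function name find_slice_starts is a hypothetical name used only for this illustration, and the byte string used in the usage check is made-up data, not a real bit stream.

```python
def find_slice_starts(stream: bytes):
    """Return the byte offsets of slice start codes (00 00 01 followed by 01..AF)."""
    prefix = b"\x00\x00\x01"
    offsets = []
    pos = stream.find(prefix)
    while pos != -1 and pos + 3 < len(stream):
        if 0x01 <= stream[pos + 3] <= 0xAF:       # slice start code values
            offsets.append(pos)
        pos = stream.find(prefix, pos + 3)        # continue after this prefix
    return offsets
```

The offsets found by such a sequential scan could then be used to hand the data of each slice to a different processor, with the variable length decoding inside each slice carried out in parallel.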

Next, an explanation will be made of the operation of the parallel processing unit 9 when decoding a moving picture by referring to FIG. 15.

FIG. 15 is a timing chart of the state of the decoding in the three processors 2-1 to 2-3.

Note that, in FIG. 15, the processing “MBx-VLD” indicates the variable length decoding with respect to the (x+1)th macroblock x (step S46 in FIG. 13), and the processing “MBx-DEC” indicates the decoding with respect to the (x+1)th macroblock x (step S63 in FIG. 14).

As shown in FIG. 15, when the decoding is started, the first processor 2-1 sequentially carries out the variable length decoding from the macroblock 0.

When the variable length decoding of the macroblock 0 is ended in the first processor 2-1, the second processor 2-2 carries out the decoding MB0-DEC with respect to this data.

Further, when the variable length decoding of the next macroblock 1 is ended in the first processor 2-1, the third processor 2-3 carries out the decoding MB1-DEC with respect to this data.

Thereafter, the processor which ended the decoding among the second processor 2-2 and the third processor 2-3 fetches the data of the next macroblock subjected to the variable length decoding at the first processor 2-1 and carries out the decoding.

In this way, the first image encoding/decoding apparatus divides the processing steps of the encoding and decoding into steps able to be processed in parallel and steps relating to variable length coding/decoding not able to be processed in parallel and having to be processed sequentially, allots the steps for which sequential processing is necessary to the master processor and steps which can be processed in parallel to the slave processors, and then carries out the encoding and the decoding.

Accordingly, the sequentially input data is sequentially processed at these three processors 2-1 to 2-3 and transformed to the intended compressed and encoded data or the restored image data. By carrying out the encoding and the decoding by parallel processing in this way, the processing can be carried out at a higher speed compared with the usual case where the processing is carried out by one processor.

Second Image Encoding/Decoding Apparatus

In the first image encoding/decoding apparatus, however, since the sequential processing part (variable length coding and the variable length decoding) was allotted to a specific processor (first processor 2-1) in a fixed manner and that processor was made to sequentially execute the processing, there was the disadvantage that the loads became nonuniform among the three processors 2-1 to 2-3.

In such a case, if the ratio of execution times of the sequential processing part and the parallel processing part were proportional to the ratio of the numbers of the processors for executing the sequential processing part and the parallel processing part, the loads would become uniform and equal, but if not proportional, the loads of the processors would become nonuniform and unequal resulting in a fall in the performance.
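
The proportionality condition above may be illustrated by a simplified steady-state model, sketched below in Python. The model is an assumption made only for illustration: one processor executes the sequential part, n_par processors execute the parallel part, the throughput is limited by the slower of the two groups, and the time for synchronization and data transfer is ignored. The function name utilization and its parameters are hypothetical.

```python
def utilization(t_seq, t_par, n_par=2):
    """Busy fraction of the sequential processor and of the parallel processors.

    t_seq: time of the sequential part per macroblock (one processor).
    t_par: time of the parallel part per macroblock (shared by n_par processors).
    """
    throughput = min(1 / t_seq, n_par / t_par)    # macroblocks per unit time
    return throughput * t_seq, throughput * t_par / n_par
```

Under this model, both values are 1 only when t_seq : t_par equals 1 : n_par. For example, with hypothetical times t_seq = 1 and t_par = 4 and two parallel processors, the sequential processor is busy only half of the time.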

For example, in the parallel processing of MPEG encoding shown in FIG. 12, the load of the variable length coding is relatively light, therefore the first processor 2-1 frequently is idle. This becomes even more conspicuous in a parallel processing apparatus having two processors.

Further, also in the parallel processing of the MPEG decoding shown in FIG. 15, since the load of the variable length decoding is relatively light, the first processor 2-1 becomes idle at the point of time when one slice's worth of the variable length decoding is ended and until all decoding in the second processor 2-2 and the third processor 2-3 is ended.

Further, in the first image encoding/decoding apparatus, since the processing executed at the different processors is different, it is necessary to separately control the processors and synchronize the transfer of data and communication, so there also arises a disadvantage of complicated control.

Therefore, an explanation will be made of an image encoding/decoding apparatus according to the present invention, as a second image encoding/decoding apparatus, which solves such disadvantages, in particular, which can encode and decode an image at a further high speed and further which can simplify the structure and control method etc.

The hardware structure of the second image encoding/decoding apparatus is the same as that of the first image encoding/decoding apparatus mentioned above.

Namely, the parallel processing unit 1 has the configuration as shown in FIG. 9, i.e., has n number of processors 2-1 to 2-n, a memory 3, and a connection network 4. Note that these components are the same as those of the case of the parallel processing unit 9 of the first image encoding/decoding apparatus in terms of hardware structure and therefore will be explained by using the same reference numerals.

Further, the functions and configurations of the n number of processors 2-1 to 2-n to the connection network 4 are the same as those of the case of the parallel processing unit 9 of the first image encoding/decoding apparatus, so explanations thereof will be omitted.

Further, in the case of the parallel processing unit 1 of the second image encoding/decoding apparatus as well, the number n of processors is 3.

In the case of the parallel processing unit 1 of the second image encoding/decoding apparatus having the same hardware structure as that of the parallel processing unit 9 of the first image encoding/decoding apparatus, the method of the encoding and decoding of a moving picture and the operations of the processors 2-i (i=1 to 3) are different from those of the first image encoding/decoding apparatus.

Namely, the programs stored in the program ROMs or the program RAMs provided for the three processors 2-1 to 2-3 are different from those of the case of the first image encoding/decoding apparatus. Due to this, the parallel processing unit 1 of the second image encoding/decoding apparatus carries out processing different from that of the parallel processing unit 9 of the first image encoding/decoding apparatus as a whole.

In the second image encoding/decoding apparatus, the processors are made to divide and execute not only the parallel processing part, but also the sequential processing part.

For encoding, in the parallel processing unit 1 of the second image encoding/decoding apparatus, the processors divide and sequentially carry out the variable length coding of the macroblocks. Accordingly, each processor carries out all of the encoding, variable length coding, and local decoding for the macroblock it is in charge of. At this time, when the variable length coding of a certain macroblock is to be started, the processor waits only when the variable length coding of the previous macroblock has not yet been ended.

Further, for the decoding, in the parallel processing unit 1 of the second image encoding/decoding apparatus, the processors divide and sequentially carry out also the variable length decoding of the macroblocks. Accordingly, each processor carries out both of the variable length decoding and decoding for the macroblock it is in charge of. At this time, when the variable length decoding of a certain macroblock is to be started, the processor waits only when the variable length decoding of the previous macroblock has not yet been ended.

Below, an explanation will be made of the processing in each processor 2-i (i=1 to 3) when encoding and decoding a moving picture in the parallel processing unit 1 of the second image encoding/decoding apparatus and of the operation of the parallel processing unit 1.

First, an explanation will be made of the processing in each processor 2-i when encoding.

In the parallel processing unit 1 of the second image encoding/decoding apparatus, in the same way as the first image encoding/decoding apparatus mentioned above, one processor is decided on as the master processor and the others as the slave processors and made to carry out different predetermined processing. However, the only difference of processing between the master processor and slave processors is that the master processor generates the headers and starts the slave processors: the encoding, the variable length coding, and the local decoding regarding the actual encoding are carried out at both of the master processor and the slave processors by similar procedures. Namely, the master processor and the slave processors carry out the processing by different processing procedures, but the main processing part of the encoding is carried out by the same procedure.

Below, an explanation will be made of the processing of each processor.

First, the first processor 2-1 serving as the master processor carries out the processing as shown in the flow chart of FIG. 16.

Namely, when the encoding is started (step S70), the sequence header is generated (step S71), the GOP header is generated (step S72), the picture header is generated (step S73), and the slice header is generated (step S74).

When the generation of the slice header is ended, the master processor starts the slave processors (step S75).

When the start-up of the slave processors is ended, the master processor carries out the processing relating to the encoding in the same way as that by the slave processors.

Namely, first, it acquires the number of a macroblock to be processed (step S76) and encodes that macroblock (step S77).

Next, it confirms that the variable length coding of the previous macroblock is ended (step S78), carries out the variable length coding (step S79), and, further, carries out the local decoding (step S80).

This procedure is repeated until all processing inside the slice is ended (step S81). When all processing inside a slice is ended, the end of all processing in the slave processors is awaited (step S82).

Then, when all processing for one picture is ended, the processing routine shifts to the processing of the next picture (step S83). When the processing of all pictures of one GOP is ended, the processing routine shifts to the processing of the next GOP (step S84).

This processing is repeated until the sequence is ended (step S85), whereupon the processing is ended (step S86).

Next, the second and third processors 2-2 and 2-3 serving as the slave processors carry out the processing as shown in the flow chart of FIG. 17.

Namely, when started by the processing of step S75 in the master processor and starting the encoding (step S90), first each slave processor obtains the number of the macroblock to be processed (step S91) and encodes that macroblock (step S92).

Next, it confirms that the variable length coding of the previous macroblock is ended (step S93), carries out the variable length coding (step S94), and further carries out the local decoding (step S95).

This procedure is repeated until all processing inside the slice is ended (step S96). When all processing inside the slice is ended, the processing in the slave processor is ended (step S97).
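
The procedures of FIG. 16 and FIG. 17 for a single slice may be sketched as follows in Python, with threads standing in for the processors 2-1 to 2-3. This is only a minimal illustration under assumptions: the name encode_slice is hypothetical, enc, vlc, and local_dec are placeholder callables supplied by the caller, header generation is omitted, and the confirmation that the variable length coding of the previous macroblock is ended is modeled here with a condition variable and a counter, which is merely one possible mechanism. Python threads do not necessarily run in parallel, so the sketch shows the ordering constraint rather than the speed-up.

```python
import threading

def encode_slice(macroblocks, enc, vlc, local_dec, num_procs=3):
    """Every processor runs ENC, ordered VLC, and local DEC for its macroblocks."""
    n = len(macroblocks)
    bitstream = []                       # variable length data in macroblock order
    recon = [None] * n                   # locally decoded macroblocks
    numbers = iter(range(n))
    number_lock = threading.Lock()
    vlc_turn = threading.Condition()
    state = {"next_vlc": 0}              # macroblock whose VLC may start next

    def process_macroblocks():
        while True:
            with number_lock:            # steps S76 / S91: acquire a macroblock number
                x = next(numbers, None)
            if x is None:                # steps S81 / S96: slice finished
                return
            data = enc(macroblocks[x])   # steps S77 / S92: encoding
            with vlc_turn:
                # steps S78 / S93: confirm that the VLC of macroblock x-1 is ended
                vlc_turn.wait_for(lambda: state["next_vlc"] == x)
                bitstream.append(vlc(data))      # steps S79 / S94: VLC
                state["next_vlc"] = x + 1
                vlc_turn.notify_all()
            recon[x] = local_dec(data)   # steps S80 / S95: local decoding

    slaves = [threading.Thread(target=process_macroblocks)
              for _ in range(num_procs - 1)]
    for t in slaves:                     # step S75: start the slave processors
        t.start()
    process_macroblocks()                # the master runs the same procedure
    for t in slaves:                     # step S82: wait for the slave processors
        t.join()
    return bitstream, recon
```

In this sketch the master and the slaves execute the same function, and the only master-specific work is starting and awaiting the slaves, in line with the description above.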

Next, an explanation will be made of the operation of the parallel processing unit 1 when encoding by the operation of three processors 2-1 to 2-3 by such a processing procedure by referring to FIG. 18.

FIG. 18 is a timing chart of the state of the encoding in the three processors 2-1 to 2-3.

Note that the reference symbols showing processings in FIG. 18 are the same as those shown in FIG. 12, so explanations will be omitted.

As illustrated, when the encoding is started, the three processors 2-1 to 2-3 start the encodings MB0-ENC, MB1-ENC, and MB2-ENC of the macroblock 0, macroblock 1, and macroblock 2.

Then, when the encoding MB0-ENC is ended, the first processor 2-1 successively carries out the variable length coding MB0-VLC of the macroblock 0 and, further, the local decoding MB0-DEC of the macroblock 0. Further, when the local decoding MB0-DEC of the macroblock 0 is ended, it starts the processing with respect to the next macroblock, that is, the macroblock 3, from the encoding MB3-ENC.

On the other hand, when the encoding MB1-ENC of the macroblock 1 is ended, the variable length coding MB0-VLC of the previous macroblock 0 is still being carried out at the first processor 2-1, therefore the second processor 2-2 waits for the end of this variable length coding. When this is ended, it starts the variable length coding MB1-VLC of the macroblock 1. Then, when the variable length coding MB1-VLC is ended, it carries out the local decoding MB1-DEC of the macroblock 1. Further, when the local decoding MB1-DEC of the macroblock 1 is ended, it starts the encoding MB4-ENC with respect to the next macroblock 4.

Further, in the third processor 2-3, when the encoding MB2-ENC of the macroblock 2 is ended, the variable length coding MB0-VLC and MB1-VLC of the previous macroblock 0 and macroblock 1 have not yet been ended, therefore, the end of the processing is awaited. When the variable length coding of the macroblock 0 and the macroblock 1 is ended, the variable length coding MB2-VLC of the macroblock 2 is carried out. When the variable length coding MB2-VLC is ended, the local decoding of the macroblock 2 is carried out. Further, when the local decoding MB2-DEC of the macroblock 2 is ended, the encoding MB5-ENC with respect to the next macroblock 5 is started.

In this way, the processors 2-1 to 2-3 successively select macroblocks x to be processed and carry out the encoding MBx-ENC, variable length coding MBx-VLC, and the local decoding MBx-DEC with respect to the macroblocks x.

By carrying out the processing in this way, the start of the processing need be awaited for only the variable length coding MBx-VLC when the variable length coding MB(x−1)-VLC with respect to the previous macroblock x−1 has not been ended, but the processing can be carried out completely in parallel for other portions.

In the variable length coding MBx-VLC thereof as well, the encoding is simultaneously started at the processors 2-1 to 2-3 just at the start of the processing as shown in FIG. 18. Therefore, requests for the start of the variable length coding are superimposed, and idling occurs in the processors 2-2 and 2-3. After this, however, the processing steps in the processors will always be offset from each other and therefore such idling will hardly ever occur. Also in the example shown in FIG. 18, no idling will occur at all in other parts; it will only be necessary to wait a little in the variable length coding MB5-VLC of the macroblock 5 in the third processor 2-3.

Next, an explanation will be made of the processing in each processor 2-i when decoding in the second image encoding/decoding apparatus.

In the case of decoding as well, in the same way as the first image encoding/decoding apparatus, one processor is decided on as the master processor and the others as the slave processors and made to carry out processing different from each other. The processing of the master processor, however, differs from the processing of the slave processors only in the point that it decodes the headers and starts the slave processors: the variable length decoding and decoding regarding the actual decoding are carried out by both of the master processor and slave processors by similar procedures. Namely, the master processor and the slave processors carry out processing by different processing procedures, but the main processing part of the decoding is achieved by the same procedure.

Below, an explanation will be made of the processing of each processor.

First, the first processor 2-1 serving as the master processor carries out the processing as shown in the flow chart of FIG. 19.

Namely, when the decoding is started (step S100), the sequence header is decoded (step S101), the GOP header is decoded (step S102), the picture header is decoded (step S103), and the slice header is decoded (step S104).

Then, when the decoding of the slice header is ended, the master processor starts the slave processors (step S105).

When the start-up of the slave processors is ended, the master processor carries out processing relating to the decoding in the same way as that for the slave processors.

Namely, first, it acquires the number of the macroblock to be processed (step S106), confirms that the variable length decoding of the previous macroblock is ended (step S107), and carries out the variable length decoding of that macroblock (step S108).

When the variable length decoding is ended, it decodes that macroblock (step S109).

This procedure is repeated until all processing inside the slice is ended (step S110). When all processing inside the slice is ended, it waits for the end of all processing in the slave processors (step S111).

When all processing for one picture is ended, the processing routine shifts to the processing of the next picture (step S112). When the processing of all pictures of one GOP is ended, the processing routine shifts to the processing of the next GOP (step S113).

This processing is repeated until the sequence is ended (step S114), whereupon the processing is ended (step S115).

Next, the second and third processors 2-2 and 2-3 serving as the slave processors carry out the processing as shown in the flow chart of FIG. 20.

Namely, when started by the processing of step S105 in the master processor and starting the decoding (step S120), first each slave processor acquires the number of the macroblock to be processed (step S121), confirms that the variable length decoding of the previous macroblock is ended (step S122), and then carries out the variable length decoding of that macroblock (step S123).

Next, when the variable length decoding is ended, it decodes that macroblock (step S124).

This procedure is repeated until all processing inside the slice is ended (step S125). When all processing inside the slice is ended, the processing in the slave processors is ended (step S126).

Next, an explanation will be made of the operation of the parallel processing unit 1 when decoding by the operation of the three processors 2-1 to 2-3 by such a processing procedure by referring to FIG. 21.

FIG. 21 is a timing chart of the state of the decoding in the three processors 2-1 to 2-3.

Note that reference symbols showing processing in FIG. 21 are the same as those shown in FIG. 15, so explanations will be omitted.

As illustrated, when the decoding is started, first, the first processor 2-1 carries out the variable length decoding MB0-VLD of the first macroblock 0.

The second processor 2-2 carries out the processing with respect to the macroblock 1, but since it is necessary to successively carry out the processing for every macroblock in variable length decoding, it carries out the variable length decoding MB1-VLD of the macroblock 1 after waiting for the end of the variable length decoding MB0-VLD of the macroblock 0 at the first processor 2-1.

The third processor 2-3 similarly carries out the variable length decoding MB2-VLD of the macroblock 2 after waiting for the end of the variable length decoding MB0-VLD for the macroblock 0 at the first processor 2-1 and the variable length decoding MB1-VLD for the macroblock 1 at the second processor 2-2.

The first processor 2-1 finishing the variable length decoding MB0-VLD with respect to the macroblock 0 successively carries out the decoding MB0-DEC with respect to the macroblock 0.

When that decoding MB0-DEC is ended, the processing with respect to the next macroblock 3 is started. At this time, however, as shown in FIG. 21, if the variable length decoding MB2-VLD with respect to the previous macroblock 2 has not been ended, its end is waited for before starting the variable length decoding MB3-VLD with respect to the macroblock 3.

Below, similarly, the processors 2-1 to 2-3 successively select the macroblocks x to be processed and carry out the variable length decoding MBx-VLD and decoding MBx-DEC with respect to the macroblocks x.

By carrying out the processing in this way, while the start of the variable length decoding MBx-VLD is delayed when the variable length decoding MB(x−1)-VLD with respect to the previous macroblock x−1 has not been ended, the processings can be carried out completely in parallel for other portions.

In the variable length decoding MBx-VLD thereof as well, the decoding is simultaneously started at the processors 2-1 to 2-3 at the start of the processing as shown in FIG. 21, therefore the second processor 2-2 and the third processor 2-3 are made to wait and idling occurs in the processing, but, thereafter, the processing steps in the processors will always be offset from each other and such idling will hardly ever occur. Also, in the example shown in FIG. 21, no idling at all occurs in other processing, though the variable length decoding MB3-VLD of the macroblock 3 at the first processor 2-1 is made to wait slightly.

In this way, in the second image encoding/decoding apparatus, when carrying out MPEG encoding and decoding, the processors can carry out in a dispersed manner not only the encoding part, the local decoding part, and the decoding part which can be processed in parallel, but also the variable length coding part and variable length decoding part which must be sequentially processed.

Accordingly, the load of the sequential processing part can be uniformly and equally dispersed among the processors, and, as shown in FIG. 18 and FIG. 21, the idling time of the processors can be greatly reduced when compared with the first image encoding/decoding apparatus. As a result, the entire encoding and decoding speed can be greatly improved. Note that the effect becomes even more pronounced in a parallel processing apparatus having just two processors.
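
The difference in idling between the two apparatuses when decoding may be illustrated by the following simplified timing model in Python. It is a sketch under assumptions, not a measurement: every macroblock is assumed to take the same hypothetical time t_vld for the variable length decoding and t_dec for the decoding, the next macroblock is assumed to go to whichever processor becomes free first, the time for synchronization and data transfer is ignored, and n_procs is assumed to be 2 or more. The function names are hypothetical.

```python
def makespan_master_slave(n_mbs, t_vld, t_dec, n_procs=3):
    """First apparatus: one master does every VLD; the other processors decode."""
    free = [0.0] * (n_procs - 1)                  # time at which each slave is free
    for x in range(n_mbs):
        vld_end = (x + 1) * t_vld                 # master runs VLD back to back
        s = min(range(len(free)), key=free.__getitem__)
        free[s] = max(free[s], vld_end) + t_dec   # decode once the data is ready
    return max(free)

def makespan_distributed(n_mbs, t_vld, t_dec, n_procs=3):
    """Second apparatus: each processor does VLD then DEC; VLD stays in order."""
    free = [0.0] * n_procs                        # time at which each processor is free
    prev_vld_end = 0.0
    for x in range(n_mbs):
        p = min(range(n_procs), key=free.__getitem__)
        # VLD of macroblock x starts after the VLD of macroblock x-1 is ended
        prev_vld_end = max(free[p], prev_vld_end) + t_vld
        free[p] = prev_vld_end + t_dec
    return max(free)
```

For example, under this model with 12 macroblocks, t_vld = 1, t_dec = 4, and three processors, makespan_master_slave gives 26 time units and makespan_distributed gives 22. These values follow only from the assumed durations and do not represent measured performance.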

Further, in the parallel processing unit 1 of the second image encoding/decoding apparatus, each of a plurality of processors 2-1 to 2-n carries out a series of encoding and a series of decoding for the macroblock to be processed allotted to it on a continuous basis. For this reason, it is possible to reduce the load of synchronization among the processors, data communication, etc. Further, as a result, all of the processing time can be used for the encoding and decoding. As a result, the loads at the processors substantially become uniform and equal, and the encoding and the decoding can be carried out efficiently and at a high speed.

Further, all processors can be operated substantially under the same control and processing procedure, therefore the hardware configuration becomes simple.

Further, the present invention provides a scalable parallel processing apparatus not depending upon the number of processors, so can be applied to parallel processing apparatus of various configurations.

Note that, the present invention is not limited to only the present embodiment. Various modifications are possible.

For example, in the parallel processing unit of the embodiment, while there is only one master processor, there is no restriction on the number of slave processors. Any number is possible.

Further, the macroblock number acquired by a slave processor may be dynamically determined by the operating system, may be statically uniquely determined by a compiler or hardware, or may be determined by any other method.
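
The static and dynamic determination of the macroblock number may be contrasted by the following minimal sketch in Python. Both are only illustrative assumptions: the round-robin rule in static_assignment and the lock-protected counter in the class DynamicAllocator are hypothetical examples and are not the specific methods of the embodiment.

```python
import threading

def static_assignment(num_mbs, num_procs, proc_id):
    """Static: processor proc_id always handles proc_id, proc_id + num_procs, ..."""
    return list(range(proc_id, num_mbs, num_procs))

class DynamicAllocator:
    """Dynamic: each call hands out the next macroblock number not yet taken."""

    def __init__(self, num_mbs):
        self._numbers = iter(range(num_mbs))
        self._lock = threading.Lock()

    def acquire(self):
        """Return the next macroblock number, or None when the slice is finished."""
        with self._lock:
            return next(self._numbers, None)
```

With static assignment no communication is needed to obtain a number, while with dynamic assignment a processor that finishes early simply takes the next number.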

Further, it is possible to adopt a configuration in which the programs to be executed at the processors are stored in ROMs in advance and then provided to the parallel processing unit of the image encoding/decoding apparatus or to adopt a configuration in which the programs are stored on a storage medium such as a hard disk or CD-ROM and read into program RAMs or the like at the time of execution.

Further, in the present embodiment, as the processor according to the present invention, as shown in FIG. 9, a shared memory type parallel processing apparatus was shown as an example, but the hardware configuration is not limited to this. A so-called “message communication” type parallel processing apparatus not having a common memory and carrying out the transfer etc. of the data by “message communication” can be adopted as well.

Further, the invention is not restricted to a parallel processing apparatus in which processors are closely connected such as in the present embodiment and can also be applied to an apparatus comprised of respectively independent processors connected by any communication means to cooperate and carry out some intended processing.

Namely, the actual configuration of the apparatus may be arbitrarily determined.

Further, the parallel processing unit of the image encoding/decoding apparatus was configured having a plurality of processors carrying out predetermined operations according to certain programs operating in parallel to carry out the intended processing, but can also be configured having a plurality of processors comprised of dedicated hardware operating in parallel. For example, the present invention can also be applied to a circuit designed exclusively for variable length coding/decoding such as the encoding/decoding circuit of the MPEG, an image coding DSP, or a media processor.

Further, in the present embodiment, DCT was used as the transform system to be carried out at the encoding and decoding. However, any orthogonal transform system can be used as the transform system. Any transform, for example a Fourier transform such as a fast Fourier transform (FFT) and discrete Fourier transform (DFT), a Hadamard transform, and a K-L transform can be used.

Further, the present invention is not just applicable to the encoding and decoding of a moving picture as exemplified in the present embodiment. For example, it can also be applied to the encoding and decoding of audio data and text data and the encoding and the decoding of any other data.

Summarizing the advantageous effects of the present invention, as explained above, according to the encoding apparatus and decoder of the present invention, when carrying out the encoding and the decoding of, for example, image data, the loads can be equally and efficiently distributed among a plurality of processors and the communication for synchronization among the processors and data communication can be reduced. As a result, the encoding and decoding can be carried out at a high speed, and the control method and the hardware configuration can be simplified.

Further, according to the encoding method and the decoding method of the present invention, when carrying out the encoding and the decoding of for example image data by the parallel processing using a plurality of processors, the loads can be equally and efficiently distributed among the processors. Further, the communication for the synchronization among the processors and the data communication can be reduced. As a result, the encoding and decoding can be carried out at a high speed by easy control.

Further, the encoding method and the decoding method of the present invention are scalable methods in which the method of distribution of loads does not depend upon the structure of the parallel processor, for example, the number of the processors, so can be applied to parallel processors of a variety of configurations.

1-19. (canceled)
 20. A data processing method for data which is divided into blocks to carry out a predetermined operation that consists of a first operation performed sequentially and a second operation performed in parallel, using a multi-processor system including a plurality of processor units, the method comprising: allotting said first operation to a first processor; allotting said second operation to a plurality of processors other than said first processor; and carrying out said allotted operations on each processor successively to perform said predetermined operation of each block in parallel.
 21. A data processing method as set forth in claim 20, wherein said first processor carries out said first operation of each block in an order corresponding to the order of said block.
 22. A data processing method as set forth in claim 21, wherein said first processor carries out said first operation of each block after said plurality of processors other than said first processor have carried out a predetermined part of said second operation of the corresponding block.
 23. A data processing method as set forth in claim 21, wherein said plurality of processors other than said first processor carry out said second operation of each block after said first operation.
 24. A data processing method as set forth in claim 21, wherein said first operation comprises variable length encoding or decoding.
 25. A data processing method as set forth in claim 21, wherein said block comprises a macroblock or slice of image data.
 26. A data processing method as set forth in claim 21, wherein said multi-processor system has a plurality of processors carrying out predetermined operations according to certain programs operating in parallel.
 27. A data processing method as set forth in claim 21, wherein said multi-processor system has a plurality of processors comprised of dedicated hardware operating in parallel.
 28. A data processing method as set forth in claim 21, wherein said multi-processor system has a plurality of respectively independent processors connected by any communication means to cooperate and carry out some intended processing.
 29. A data processing apparatus for data which is divided into blocks to carry out a predetermined operation that consists of a first operation performed sequentially and a second operation performed in parallel, with a multi-processor system including a plurality of processor units, the apparatus comprising: a first processor for carrying out said first operation; and a plurality of processors other than said first processor for respectively carrying out said second operation, wherein each processor carries out each operation successively to perform said predetermined operation of each block in parallel.
 30. A data processing apparatus as set forth in claim 29, wherein said first processor carries out said first operation of each block in an order corresponding to the order of said block.
 31. A program for processing data divided into blocks to carry out a predetermined operation that consists of a first operation performed sequentially and a second operation performed in parallel, using a multi-processor system including a plurality of processor units, comprising: allotting said first operation to a first processor; allotting said second operation respectively to a plurality of processors other than said first processor; and carrying out said allotted operation on each processor successively to perform said predetermined operation of each block in parallel.
 32. A program for data processing as set forth in claim 31, wherein said first processor carries out said first operation of each block in an order corresponding to the order of said block.
 33. A multi-processor system including a plurality of processor units for processing data divided into blocks to carry out a predetermined operation that consists of a first operation performed sequentially and a second operation performed in parallel, comprising: a first processor for carrying out said first operation; and a plurality of processors other than said first processor for respectively carrying out said second operation, wherein each processor carries out each operation successively to perform said predetermined operation of each block in parallel.
 34. A multi-processor system as set forth in claim 33, wherein said first processor carries out said first operation of each block in an order corresponding to the order of said block.
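The division of work recited in the claims above, in which a first processor performs the sequential operation in block order while the other processors perform the parallel operation, might be sketched as follows. This is an illustrative assumption and not the claimed implementation: `parallel_stage`, `sequential_stage`, and `process` are hypothetical names, a thread pool stands in for the processors other than the first, and the calling thread plays the role of the first processor.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_stage(block):
    # Hypothetical per-block work that does not depend on any other block.
    return [v * v for v in block]


def sequential_stage(state, data):
    # Hypothetical order-dependent step: the output records the running state,
    # much as a position in a variable length stream depends on earlier blocks.
    return state + sum(data), (state, data)


def process(blocks, num_parallel):
    """Run parallel_stage on a pool and sequential_stage in block order."""
    stream, state = [], 0
    with ThreadPoolExecutor(max_workers=num_parallel) as pool:
        # Allot the parallel operation to the workers other than the "first".
        futures = [pool.submit(parallel_stage, b) for b in blocks]
        # The calling thread acts as the "first processor": it consumes the
        # results strictly in block order, waiting for each block in turn.
        for f in futures:
            state, item = sequential_stage(state, f.result())
            stream.append(item)
    return stream
```

In this sketch the sequential step for a block starts only after the parallel step for that block has finished, which corresponds to the ordering in which the first operation follows part of the second operation; the reverse ordering, as in decoding, would need the stages swapped.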